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ABSTRACT 



A new, computationally- and statistically-efficient algorithm, the Fast al- 
gorithm (Fx^), can find a periodic signal with harmonic content in irregularly- 
sampled data with non-uniform errors. The algorithm calculates the minimized 

as a function of frequency at the desired number of harmonics, using Fast 
Fourier Transforms to provide O(A^logA^) performance. The code for a refer- 
ence implementation is provided. 

Subject headings: methods: data analysis — methods: numerical — methods: 
statistical — stars: oscillations 
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Introduction 



A common problem, in astronomy and in other fields, is to detect the presence and 
characteristics of an unknown periodic signal in irregularly-spaced data. Throughout this 
paper I will use examples and language drawn from the study of periodic variable stars, but 
the techniques can be applied to many other situations. 

Generally, detecting periodicity is achieved by comparing the measurements phased at 

each of a set of trial frequencies to a model periodic phase function, $(0), and selecting the 

frequency that yields a high value for a quality function. The commonly-used techniques 

vary in the form and parameterization of $, the evaluation of the fit quality between model 

and data, the set of frequencies searched, and the methods used for computational efficiency. 

I 



The Lomb algorithm (iLomb 



19761 ). for example, uses a sinusoid plus constant for its 
model function. The quality is the amplitude of the unweighted least-squares fit at the trial 
frequency. A simple implementation takes 0{nN) computations , where n is the numbe r 



Press fc Rvbickil fll989h 



of measurements and is the number of frequencies searched, 
present an implementation that uses the Fast Fourier Transform (FFT) to give 0{NlogN) 
performance. Spurious detections can be produced at frequencies where the sample times 
have significant periodicity, power from harmonics above the fundamental sine wave is 
lost, and the algorithm is statistically inefficient in the sense that it ignores point-to-point 
variation in the measurement error. 



Irwin et al. I (120061 ) use a Lomb algorithm to find a candidate period, then test the 
measurements' x^i'sduction between a constant value and a model periodic lightcurve 
obtained by folding the measurements at that period and applying smoothing. 



Phase Dispersion Minimization (jStellingwerf 



19781 ) divides a cycle into (possibly 



overlapping) phase bins. The quality is calculated from the x^agreement among those data 
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points that fall into each bin at the trial frequency. This also has 0{nN) performance. The 
size and number of the bins, and the time of = 0, are choices that can affect the detection 
probability of a particular signal. 

The Fast (hereafter Fx^) technique presented here uses a Fourier series truncated at 
harmonic H: 

^H{{Ao...2Hj},t) = Ao+ J2 A2h-lSm{h2TTft) + A2hCOs{h2TTft) (1) 



for the model periodic function. The fit quality is the x all data , 



h=l...H 

2r 



Baluev 



jointly minimized 



(120081 ) investigates 



over the Fourier coefficients, Aq_2H, and the frequency, /. 
the statistics of these fits. The set of frequencies searched can have arbitrary density 
and range. The computational complexity of this implementation is 0{H N log HN). 
The optimal choice of H depends on the (generally-unknown) true shape of the periodic 
signal, but evaluations with multiple H values can share computational stages, resulting 
in determinations at if' = 1. . .if at only a slight premium. The fit follows standard 
X^statistics and makes efficient use of measurement errors. 



2. Description 

X^minimization is the standard method to fit a data set with unequal, Gaussian, 
measurement errors. (Other Maximum Likelihood methods can handle more general error 
distributions, but are beyond the scope of this paper.) 

A hypothesis may be expressed as a model M(P,t), where P is the set of unknown 
model parameters and t is an independent variable. This is compared to measurements 
Xi, with standard errors o"j, made at tj. The comparison is made by finding the P that 
minimizes 

,^(P) = ^ (mp,t^-^.)\ p 

i=l...n * 
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(In this paper t is a scalar value which, for the variable star case, represents time. 
However, the extension of this technique to spatial or other dimensions, to higher dimension 
such as the {u,v) baselines of interferometry, and to non-scalar measurements, Xi, is 
straightforward. ) 

If this minimum x^is significantly below that for a Null Hypothesis, then this is 
evidence for the model, and for the value of P at the minimum. The extent by which the 
model decreases the x^compared to the Null hypotheses, Ax^ = Xo ~ X^(P)) is known as 
the Minimization Index and, if the Null Hypothesis is true and other conditions apply, has 
a ^^probability distribution with the same number of Degrees Of Freedom as the model. 

Many models of interest are linear in some parameters, and non-linear in others. These 
models can be factored into the linear parameters. A, and a set of functions M(Z,t) of the 
non-linear parameters, Z, so that 

M({A,Z},t) = ^A,M,(Z,t) (3) 

i 

For any given Z, the x^minimized with respect to A can be rapidly found in closed 
form by linear regression. Finding the global minimum of x^thus reduces to searching the 
non- linear parameter space Z. 

The truncated Fourier series function in equation [T] is linear in the coefficients Aq 2h, 
and non-linear in Z = /, with component functions 

Mo(/,t) = 1 (4) 
M2^-i(/,t) = sin h2n ft (5) 
M2hif,t) = cos h27r ft (6) 

Therefore, the minimum x^can be quickly calculated at any chosen frequency. Complete, 
dense coverage of the frequency range of interest is required to find the overall minimum — the 
orthogonality of the Fourier basis tends to make the minima very narrow. 
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Proceeding in the usual manner for linear regression (see e.g, Numerical Recipes 



(jPress et al. 



2002), §15.4 for a review) we produce the a matrix and (3 vector used in the 



'normal equations'. The values of a and (3 depend on the frequency /. 

The a matrix comes from the cross terms of the model components, as a weighted sum 
over the data points: 

o^jk= 2^ —2 (7) 

i=l...n * 

The (3 vector is the weighted sum of the product of the data and the model components: 



1=1. ..n ' 

A frequency-independent scalar, X, which is the x^corresponding to a constant = 
model: 

X = l]x,Vcrf (9) 

i 

is also used in the calculation of x^- 

The minimization of equation [2] over the linear A parameters can be shown to give a 
minimum subject to / of 

'xLn = X-5^5^^/3,ra-VA (10) 

j k 

The Fourier coefficients corresponding to this minimum are 

^A, = Y,i'a'')3k'Pk (11) 

k 

We can use the trigonometric product relations: 

sin(a) sin(6) = i(cos(a — 6) — cos(a + 6)) (12) 

sin(a) cos(6) = |(sin(a + 6) + sin(a — 6)) (13) 

cos(a) sin(6) = |(sin(a + 6) — sin(a — 6)) (14) 

cos(a) cos(6) = i(cos(a — 6) + cos(a + 6)) (15) 
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to reduce the cross-terms of sines and cosines in a to 



Ww-i =\{C\{h-h!)j)-C\{h^h!)j)) (16) 

^a2/.-i,2h' = \{S'{{h + h!)!) + S\{h - h')f)) (17) 

W,2h'-i =i('5'((/i + /i')/)-'5'((/i-/i')/)) (18) 

^«2M^' =i(C'((/i-/i')/) + C^'((/^ + /^')/)) (19) 



while the terms for (3 are 



'P2h-i = S{hf) (20) 
'P2h = cm (21) 



where 



(22) 



C{f) = Ecos{2n fU)x,/af = re P(/) 

Sif) = ZM^^ftiW^f = iniP(/) 

C'if) = Ecos(27rA)M2 = reQ(/) 

S'if) = ZM^^m/crf = imQ(/) 

The functions C, C, S, and S' are the cosine and sine parts of the Fourier transforms 
of the weighted data and the weights: 

Pit) = J:^Sit-t^)^ P{f) = J p{t)e^'-f'dt 

{26} 

lit) = Y.At-ti)^ Qif) = f q{t)e''-f'dt 

i 

The function p{t) is the weighted data, and q{t) is the weight, as a function of time. 
These are both zero at times when there is no data, and Dirac delta functions at the time 
of each measurement. P and Q can be efficiently computed by the application of the FFT 
to p and q. 

This FFT calculation of P and Q, their use in construction of a and (3 for a set of 
frequencies, and the x^minimization by equation [10] for each of those frequencies, are the 
components of the Fx^ algorithm. 
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3. Implementation 

The steps in implementing a Fx^ search are: a) Choosing the search space b) 
Generating the Fourier Transforms c) Calculating the normal equations at each frequency 
d) Finding the minimum e) Interpreting the result 

a) Choosing the search space 

The frequency range of interest, the number of harmonics, and the density of coverage 
may be chosen based on physical expectations, measurement characteristics, processing 
limitations, or other considerations. 

The maximum frequency searched may be where the exposure times of the observations 
are a large fraction of the period of the highest harmonic, or some other upper bound 
placed by experience or the physics of the source. At low frequencies, if there are only a few 
cycles or less of the fundamental over the span of all observations, a large x^decrease may 
be evidence of variability but not necessarily of periodicity. 

It is not necessary that the fitting function accurately represent all aspects of the 
physical process. Sharp features in the actual phase function may require high harmonics 
to reproduce as a Fourier sum, but can be detected with adequate sensitivity using a small 
number of harmonics. Details of the phase function (e.g, the behavior on ingress and egress 
of an eclipsing binary) may be determined by other techniques once a candidate frequency 
has been found. 

The density of the search-how closely the trial frequencies are spaced-affects the 
sensitivity as well. For a simple Fourier analysis of the fundamental, a spacing of one cycle 
over the span of observations can cause a reduction in amplitude to l/-\/{2) if the true 
frequency is intermediate between two trial frequencies. For an analysis to harmonic if, the 
spacing of trial fundamental frequencies must be correspondingly tighter. The maximum 
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sensitivity loss depends on the harmonic content of the signal, which is generally not a 
priori known, but a spacing much looser than 1/H cycles will typically lose the advantage 
of going to harmonic H, while a much tighter spacing will consume resources that might be 
better employed with a search to if + 1, depending on the characteristics of the source. 

If AT is the timespan of all observations and St is the fastest timescale of interest, then 
a reasonable maximum frequency and spacing for the fundamental would be fmax^^/ {'^H5t) 
and 5f<l/{HAT). 

b) Generating the Fourier Transforms 

The calculation of ^Xmin requires evaluation of the Fourier functions P, at {0, / . . . Hf}, 
and Q, at {0,f...2Hf}. 

The real-to-complex FFT, as typically implemented, takes as input 2J\f real data 
points from uniformly-spaced discrete times t — {0, St, 2St . . . {2N — l)5i\. If the data 
were not sampled at those exact times, they are 'gridded' (e.g, by interpolation, or by 
nearest-neighbor sampling) to estimate what the data values at those times would have 
been. The output of the FFT is the H complex Fourier components corresponding to 
frequencies / = {0, 5f, 25 f . . .U5f}, where 5f = l/{2J\f5t). 

To provide the frequency range and density required for the Fx^ method, the 
weighted-data and the weights are placed in sparse, zero-padded arrays, with 5t bin width, 
that cover at least 2H times the observed time period. This produces discrete functions 
(using hat symbols to indicate discrete quantities) 

=Y.^hiAr-^ (24) 
-T.AiX]j. (25) 

In this notation, hatted times are integer indices offset by a starting epoch, Tq < all ti, 
scaled by St, and rounded down to the lower integer: i — floor((t — TQ)/St); and S is the 
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Kronecker delta: 

{1 m = n 

(26) 
m ^ n 

For numerical reasons, the measurements are adjusted by the mean value, 

and the value of Aq found by the algorithm should have x added back to find the true mean 
value of the source. 

When implemented, these discrete functions can be represented as arrays with 
indices [0 . . . 2A/' — 1] • To search an acceptable density of frequencies, you should choose 
{M / H)5t^/S.T . Most practical implementations of the FFT place requirements on M for 
the sake of efficiency, such as being a power of 2 or having only small prime factors. 

The real arrays p[t\ and q\t\ are then passed to an FFT routine to get the complex 
arrays P[f] and Q[f]. The discrete frequency indices / = 0...A/'— 1 correspond to 
frequencies / = f5f. (Many FFT implementations use the imaginary part of the / = 
element to store the cosine component at the Nyquist frequency fnyquist = A/'5/.) 

c) Calculating the normal equations at each frequency 

Equations describe how to construct a and j3. The a matrix at a given / is 

based on the terms of Q at indices {0, /, 2/ . . . 2Hf}. The /3 vector is based on the terms 
of Pat {0j,2f...Hf}. 

Streamlining the notation so that Q[nf] = -jiQj^ + i xQn P[nf] = nPn + i iPn, the 
H = 2 case can be written: 
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'a = - 
2 



2 -rQq 2 xQ\ 2 T^Qi 2 2:(52 2 7^(32 

2 xQ\ nQo ~ ■R.Q2 1Q2 nQi ~ nQs 1Q3 + iQi 

2 -R-Qi 1Q2 nQo + nQ2 1Q3 ~ xQi nQi + nQz 

2 Q^i ^ 7^Q3 x'^s ~ jQi TzQo ^ tiQa 1Q4 

^2 j^Q^ xQ^+xQi nQi'^ nQi iQa nQo'^nQi ) 



( p \ 
n^o 



(28) 



nPi 
1P2 



(29) 



\ nP2 J 

The minimum ^ach frequency is less than or equal to that for the constant value 

Null Hypothesis: 

{xi - xY 



Xo 



-I 



(30) 
(31) 
(32) 



The values of the Fourier Coefficients, -^A^ = Sfe(^ct~^)jfc^/3fe, are not required for 
finding the minimum. However, they will typically be calculated as an intermediate result 
and may be used for further analysis. 

e) Interpreting the result 

The value of / with the largest value of ^Ax^ = Ylij J ^~^)jk^ h (and thus the 
lowest x^) provides the best fit among the searched frequencies. 
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The ^ Ax^ value tells how much better the model at that frequency fits the data than 
the constant-value Null Hypothesis does. ^Ax^ must be compared to what is expected by 
chance, given the number of additional free parameters in the model {2H) and the number 
of trials (representing independent frequencies searched) . 

The number of independent frequencies searched can be made arbitrarily large by 
decreasing the gridding interval, and so any given x^improvement can in theory be diluted 
away to insignificance. However, the number of trials is only unbounded towards higher 
frequencies. If you search starting at low frequencies, the number of trials 'so far' can be 
treated as being proportional to /. In that case, the value to be minimized is 

p{f) = mH,>AA'^x') (33) 

where V2h,>Ax'^ the cumulative x^distribution with 2H degrees of freedom. p(/) is 
proportional to the probabihty, given the Null Hypothesis, of finding such a large decrease 
in x^by chance at a frequency / or below. 

The use of / to adjust V is straightforward from a frequentist perspective. Prom a 
Bayesian perspective it corresponds to a prior assumption that the distribution of log{f) 
is uniform (which is equivalent to log{period) being uniform) over the search interval. A 
different adjustment could be made if a different Bayesian prior were desired. 

Because the true minimum might fall between two adjacent frequency bins, and 
because the gridding of the data causes some sensitivity loss, frequencies in the vicinity of 
the minimum, and in the vicinities of other frequencies that have local minima that are 
almost as good, should be searched more finely. These searches should use the ungridded 
data to directly calculate a locally-optimum ^Ax^ and adjustments. 

Multiple candidate frequencies can be extracted and examined to see if they can reject 
the Null Hypothesis using other statistics in combination with ^Ax^ . For example, in 
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a search for stars with transiting planets a particular shape of light curve is expected. 
Finding an otherwise marginal ^Ax^ in combination with this light curve shape would be a 
convincing detection. Fx^ can quickly find all candidate frequencies with marginal or better 

The reduced-x^, the ratio of the minimized x^to the degrees of freedom, is a useful 
measure of how well the data fits the model. However, even the correct frequency can 
produce a poor reduced-x^under several circumstances. The source may have a light 
curve that has sharp features, or is otherwise poorly-described by an if-harmonic Fourier 
series. The source may have 'noisy' variations in addition to periodic behavior. There 
may be multiple frequencies involved, as in Blazhko effect RR Lyraes, overtone Cepheids, 
or eclipsing binaries where one of the components is itself variable. In all these cases, 
the detection of a period can provide a starting point for further analysis of the source's 
behavior. 

Because the source may be noisier (compared to the model) than the measurement 
error, an adjusted Ax^ niay be defined 

.2 _ xl- X\P) 

For a noisy source, this will allow significance calculations to be based on the noise of the 
source, rather than the measurement error, preventing false positives due to fitting the 
source noise. For truly constant sources, and for sources that are accurately described 
by the harmonic fit, the measurement error will dominate. This is an improvement 
over the traditional method of using the standard deviation of the data as a surrogate 
for measurement error because it continues to incorporate the known instrumental 
characteristics and because it does not overstate the source noise of a well-behaved variable 
source. For any given fit, the period with the best Ax^ will also have the best Axl^j- 

The best-fit frequency is not necessarily the 'right answer'. There are several effects 
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that can produce x^minima at frequencies that are not the frequency of the physical system 
being studied. Some examples of this are shown in §4. For example, a mutual-eclipsing 
binary of two similar stars may be better fit at 2f orbit than at f orbit for a given H. 

If the set of sampling times has a strong periodic component, then this can produce 
'aliasing' against periodic or long-term variations. Irregular sampling improves the behavior 
of x^-based period searches. For ahas peaks to be strong, there must be some frequency, 
f sample fof which a large fraction of the sampling times are clustered in phase to within a 
small fraction of the true source variation timescale. For many astronomical datasets taken 
from a single location only at night, this is the case with f sample — 1/day or 1/siderealday. 
Depending on the observation strategy (e.g, observing each field as it transits the meridian 
vs. observing several times a night) the clustering in phase can be tighter or looser, and 
thus produce greater or lesser aliasing at short periods. 

If there is long-term variation in the source, then there may be alias peaks near f sample 
and its harmonics, even if the long-term variation is aperiodic. If the source combines a 
high-frequency periodicity with a higher-amplitude long-term aperiodic variation, then the 
alias peaks can provide the lowest x^- To detect the short period variation, the long-term 
variation may be removed with, e.g, a polynomial fit (although, as discussed in §4, this 
will not necessarily result in an improvement). An initial run of the Fx^ algorithm with 
5t — 1/ f sample Can be used to quickly test whether the long-term variations are themselves 
periodic. 

One advantage of x^methods over Lomb is that they do not produce alias peaks 
near f sample if the source is constant. Near f sample, the limited phase coverage of the 
samples provides only weak constraints on the amplitude of the Fourier components. 
Statistical fluctuations can produce large amplitudes (and thus high Lomb values), but the 
correspondingly large standard error on the amplitude ensures that such fits do not yield a 
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large improvement in 



4. Reference Implementation And Examples 

An implementation of the algorithm, written in C, is available at the aut hor's website 



For m aximum portability and clarity, it uses the Gnu Scientific Library (GSL, iGalassi et al. 



20031 ) and includes a driver interface that reads the input data as ASCII files. 



The processing speed migh t be improved by using a different FFT package, such as 



FFTW fIFrigo k Johnson 



20051 ) . by stripping the GSL and CBLAS layers from the core 
BLAS routines, and by using binary 1/0. If many sources are measured at the same set 
of sampling times and have proportional measurement errors (e.g, an imaging instrument 
returning to the same Field Of View many times) then the same ^ matrix may be reused 
on the ^ (3 vectors of all sources. Additional speed-ups are also possible. 

However, before implementing such optimizations, users should balance the potential 
efficiency gains against the cost of the modification and determine whether the reference 
implementation is already fast enough for their purposes. 

As a test case, the referen ce implementation was applied to the Hipparcos Epoch 



Photometry Catalogue (HEP) (Ivan Leeuwen 



1997h . 



Although the primary mission of Hipparcos was to provide high accuracy positional 
astrometry, it also produced a large easily-available high-quality photometry dataset. The 
processed data includes 'Variability Annex 1' (hereafter VAl), listing variables with periods 
derived from the the Hipparcos data, or periods from the previous literature consistent 
with the Hipparcos data. The HEP was later reprocessed by several groups, such as 



^http : //public . lanl.gov/palmer/fastchi .html 



- 16 - 



Koen &: Ever I (120021 ). to discover additional periodic variables. 



The complete dataset of > 10^ stars, averaging > 10^ measurements each, spanning 
> 10^ days, takes < 10 hours to search to frequencies up to {2hours)^^ and if = 3 on 
a standard desktop computer (a 2006 Apple Mac Pro with 2x2 Intel cores at 2.66 GHz). 
This is a factor of 2-3 times slower than the 'Fast Computation of the Lomb Periodogram' 
code in Numerical Recipes §13.9, with comparable parameters for the search space, using 
the same computer and compiler. Profiling the code during execution finds that the bulk of 
the CPU time is split roughly evenly between the FFT and the linear regression stages. 

Of the 118218 stars in the HEP, 115375 had sufficient high quality data to be processed 
by the reference implementation. (Most of the others had interfering objects in the field of 
view.) Of these, 2275 had periods listed in VAl, Pvai, and had at least 50 measurements 
with no quality fiags set spanning more than 3 x Pvai- 

In the subset of 2275 VAl stars, 2066 (88.5%) had calculated periods that either agreed 
with (50.0%) or were harmonically related to Pvai- In descending order of incidence, these 
harmonic relations were 2 x Pvai (15.7%); 1/2 x Pvai (12.8%); 3 x Pvai (9.5%); and 
3/2 X Pvai (0.6%). The other 262 stars (11.5%) had calculated periods that were unrelated 
to the HEP value. 

The reference implementation has the ability to detrend the the data with a polynomial 
fit (to search for periodic variation on top of a slow irregular variation). However applying 
this to the HEP data, removing variation out to t^, decreased the agreement with the VAl 
periods, increasing the number of harmonically unrelated periods from 11.5% to 16.9%, 
and decreasing the number at Pvai from 50.0% to 42.4%. Koen & Eyer recommend that if 
analysis with and without such detrending find different periods, then the periods should 
be considered unreliable. 
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Figure [T] shows the cumulative distribution of ^X^adj HEP data and for 

randomized data. This demonstrates a well-behaved false-positive rate when observing 
constant sources. Excluding the VAl stars (representing 2% of the population) a further 
~10% of the HEP stars are clearly distinguishable from randomized data by Fx^ algorithm. 
This indicates variability in these stars, although they do not have to be strictly periodic as 
long as there is significant power at some period. 

Figure [2] shows a comparison of the results of the Fx^ and Lomb algorithms for an 
example star, HIP 69358, which is listed as an unsolved variable in the Hipparcos catalog. 



A single strong 



Otero fc Wils 



peak at -Ppx^ = 2.67098(i is consistent with the period of 2.67096(i found by 



(120051 ). However, the Lomb algorithm finds 12 peaks with higher strength 
than the one at that period. As seen from the folded light curves at the strongest PLomb 
and Pyx^i found the characteristic light curve of an eclipsing binary while Lomb found 
a noise peak. 

Cases where Fx^ does not find the fundamental period are useful for examining the 
limitations of this algorithm. Some examples of this are presented in Figure H] and discussed 
in its caption. 



5. Conclusion 

The Fast x^technique is a statistically efficient, statistically valid method of searching 
for periodicity in data that may have irregular sampling and non-uniform standard errors. 

It is sensitive to power in the harmonics above the fundamental frequency, to any 
arbitrary order. 

It is computationally efficient, and can be composed largely of standard FFT and linear 
algebra routines that are commonly available in highly-optimized libraries. 



A reference implementation is available and can be easily applied to your data set. 
Facilities: HIPPARCOS 
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100 1000 10000 100000 

^Xadf for H=3 



Fig. 1. — Cumulative distribution of /^xldj f*^^ HEP stars with known periods from VAl 
(solid line) and those without VAl periods (dashed hne) as a fraction of all HEP stars. To 
test the false positive rate, the analysis was repeated on two randomized samples of the data. 
In the Bootstrap shuffle (short-dashed line) , the data for each star was shuffled so that each 
measurement value and its error estimate, ± (jj, was randomly assigned to a measurement 
time ti. For the Random Gaussian (dotted line), the value at each ti was replaced by random 
variables with constant mean and standard deviation C ± a. 
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Fig. 2. — Strength of the signal as a function of period for HIP 69358 for F^^ (top) and 
Lomb (bottom) algorithms. The Lomb peak corresponding to the Fx^ peak is circled. 
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Plomb = 2-31 85 Pp^2 = 2.67098 



Fig. 3.— HEP data for HIP 69358 folded at the periods found by Lomb (left) and Fx^ 
(right) algorithms. 
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Fig. 4. — Light curves of various stars folded at Pvai and Ppx^ to demonstrate some of tiie 
ways that Fx^ can find a period other than the fundamental of the physical system. Star 
classification for these stars comes from the Simbad Database, operated at CDS, Strasbourg, 
Prance (http://simbad.u-strasbg.fr) a) HIP 109303 AR Lac is an RS CVn ecHpsing 
binary with Pvai = 1 -983180? which Fx^ best-fits at half the period. Due to the narrow 
minima, the H — 3 Fourier expansion at the fundamental, {If, 2/, 3/}, provides a worse 
fit than the fit at half the period {2/, 4/, 6/}. However, the different depths of the two 
minima are clearly distinguishable, showing unambiguously that Pvai is the true orbital 
period, b) HIP 10701 AD Ari has Pvai = 0.269862d. However, Fx^ finds a period 
of twice that, which appears to give a better fit to the data. This is a 5 Set type variable, 
which are characterized by multi-modal oscillations, so power at multiple frequencies is to 
be expected. c)HIP 59678 == DL Cru is an a Cyg variable. These stars have complex 
oscillations, so different cycles at the primary fundamental may have different apparent 
ampfitudes. The Fx^ algorithm to H — 3, for this data set, found a lowest x^at triple the 
fundamental period. This splits the cycles during the observation into three sets, each of 
which can have a quasi-independent average amplitude adjusted by the Fourier components, 
allowing a sfight additional x^reduction through noise-fitting. d)HIP 115647 —— DP Gru 
is an Algol-type eclipsing variable with narrow minima. A fit to the best H — 3 Fourier 
expansion is not a good model for the shape of this lightcurve, and so the Fx^ would not be 
very effective in finding this period unless H is increased. However, the HEP data samples 
only 3 minima, spread over 145 orbits, and so may not be sufficiently constraining to uniquely 
determine a period, regardless of the algorithm used. e)HIP 98546 == V1711 Sgr is a W Vir 
type variable. Pvai = 15.052d, but Fx^ gives 10.566(i. The Fx^ fit is better in the formal 
sense of having a lower x^with the line passing closer to the data points, but it does not look 
fike a W Vir light curve. To add to the confusion, the General Catalog Of Variable Stars 
(http://www.sai.msu.su/gcvs) gives P = 28.556(i, but this period is not apparent in the 
Hipparcos data. i)HIP 12557 —— W Tri, a semi-regular pulsating star, has Pvai — lOSci, 
which is the third-best value found by Fx^ (after 17.99d and 22.68d). 
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